39 research outputs found

Metrical Grids and Generalized Tier Projection

Author: Hao Yiding
Publication venue: ScholarWorks@UMass Amherst
Publication date: 01/01/2020

This paper formalizes metrical grid theory (MGT, Prince, 1983; Hayes, 1995) and studies its expressive power. I show that MGT analyses of a certain form can describe stress systems beyond the input tier-based input strictly local functions proposed by Hao and Andersson (2019), but conjecture that such analyses do not describe systems beyond the input tier-based strictly local languages of Baek (2018). These results reveal fundamental differences between the three formalisms.

ScholarWorks@UMass Amherst

Learnability and Overgeneration in Computational Syntax

Author: Hao Yiding
Publication venue: ScholarWorks@UMass Amherst
Publication date: 01/01/2019

This paper addresses the hypothesis that unnatural patterns generated by grammar formalisms can be eliminated on the grounds that they are unlearnable. I consider three examples of formal languages thought to represent dependencies unattested in natural language syntax, and show that all three can be learned by grammar induction algorithms following the Distributional Learning paradigm of Clark and Eyraud (2007). While learnable language classes are restrictive by necessity (Gold, 1967), these facts suggest that learnability alone may be insufficient for addressing concerns of overgeneration in syntax.

ScholarWorks@UMass Amherst

Rhythmic Syncope in Subregular Phonology

Author: Bowers Dustin
Hao Yiding
Publication venue: ScholarlyCommons
Publication date: 01/10/2020

Rhythmic syncope describes the deletion of vowels in an alternating rhythmic pattern, so that every other underlying vowel deletes. We informally summarize a proof that rhythmic syncope cannot be represented by a strictly local function over segments. Rather, rhythmic syncope can only be generated by a strictly local function if input and output symbols are synchronized, so that locality can be computed over both the input and output value at a particular time step. This structural property may only be needed to describe rhythmic syncope, which means that before concluding that human phonology can compute such functions, it is essential to verify the extent to which rhythmic syncope is attested as a stable and productive synchronic pattern.

ScholarlyCommons@Penn

Action-Sensitive Phonological Dependencies

Author: Bowers Dustin
Hao Yiding
Publication date: 11/06/2019

This paper defines a subregular class of functions called the tier-based synchronized strictly local (TSSL) functions. These functions are similar to the tier-based input-output strictly local (TIOSL) functions, except that the locality condition is enforced not on the input and output streams, but on the computation history of the minimal subsequential finite-state transducer. We show that TSSL functions naturally describe rhythmic syncope while TIOSL functions cannot, and we argue that TSSL functions provide a more restricted characterization of rhythmic syncope than existing treatments within Optimality Theory. Comment: To appear in the Proceedings of the 16th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology.

arXiv.org e-Print Archive

The University of Arizona

Computing Vowel Harmony: The Generative Capacity of Search & Copy

Author: Andersson Samuel
Dolatian Hossep
Hao Yiding
Publication venue: Linguistic Society of America
Publication date: 02/05/2020

Search & Copy (S&C) is a procedural model of vowel harmony in which underspecified vowels trigger searches for targets that provide them with features. In this paper, we seek to relate the S&C formalism to models of phonological locality proposed by recent work in the subregular program. Our goal is to provide a formal description, within the framework of mathematical linguistics, of the range of possible phonological transformations that admit an analysis within S&C. We show that, when S&C is used in its unidirectional mode, all transformations described by an S&C analysis can be modeled by tier-based input strictly local (TISL) functions. This result improves on the previous result of Gainor et al. (2012), which showed that vowel harmony processes can be modeled by subsequential functions. However, non-TISL transformations can be given S&C descriptions in the following ways. Firstly, since TISL functions are not closed under composition, a non-TISL vowel harmony pattern may be obtained by applying two S&C rules sequentially. Secondly, when S&C is used in its bidirectional mode, it has the ability to describe transformations that cannot be modeled by finite-state functions.

Proceedings Published by the LSA (Linguistic Society of America)

MILL: Mutual Verification with Large Language Models for Zero-Shot Query Expansion

Author: Hao Changying
Jia Pengyue
Li Xiaopeng
Liu Yiding
Wang Shuaiqiang
Yin Dawei
Zhao Xiangyu
Publication date: 13/11/2023

Query expansion is a commonly used technique in many search systems to better represent users' information needs with additional query terms. Existing studies for this task usually propose to expand a query with retrieved or generated contextual documents. However, both types of methods have clear limitations. For retrieval-based methods, the documents retrieved with the original query might not be accurate enough to reveal the search intent, especially when the query is brief or ambiguous. For generation-based methods, existing models can hardly be trained or aligned on a particular corpus, due to the lack of corpus-specific labeled data. In this paper, we propose a novel Large Language Model (LLM) based mutual verification framework for query expansion, which alleviates the aforementioned limitations. Specifically, we first design a query-query-document generation pipeline, which can effectively leverage the contextual knowledge encoded in LLMs to generate sub-queries and corresponding documents from multiple perspectives. Next, we employ a mutual verification method for both generated and retrieved contextual documents, where 1) retrieved documents are filtered with the external contextual knowledge in generated documents, and 2) generated documents are filtered with the corpus-specific knowledge in retrieved documents. Overall, the proposed method allows retrieved and generated documents to complement each other to finalize a better query expansion. We conduct extensive experiments on three information retrieval datasets, i.e., TREC-DL-2020, TREC-COVID, and MSMARCO. The results demonstrate that our method outperforms other baselines significantly.

arXiv.org e-Print Archive